This paper proposes an object tracking robot system implemented on a field programmable gate
array (FPGA). An OV7670 camera supplies real-time images of the object to the system. To improve
image quality, the images are passed through a median filter stage. The object is separated
from the background based on its color (red) and then processed with a mathematical
morphological filter to eliminate noise. The object's new coordinates are then determined in
order to send control signals to the robot. In this

1. INTRODUCTION

Object tracking is the practice of following an object's movement over time using a camera
[1]-[3]. Because the locations of objects change continuously, tracking moving objects is a crucial task in
surveillance systems [4], [5]. Application areas for object recognition and tracking include autonomous robot
navigation, surveillance, and vehicle navigation [6], [7]. Object detection is the process of finding objects in
a series of frames, and object tracking uses a camera to follow those objects as they appear over time. A
field programmable gate array (FPGA), which comprises a number of programmable logic blocks, distributed
memory, and hard digital signal processing modules, can process objects in real time [8], [9]. Object tracking
has several real-world uses, including security, surveillance, autonomous driving, automated traffic
management, biological image analysis, and intelligent robot control [10], [11]. The goal of object tracking,
like that of most computer vision systems, is to identify and extract the target object from a stream of
images that the camera records continuously [12]-[14]. Faster image processing makes better object tracking
possible. In practice, object tracking applications are often built on the OpenCV library [15]-[17] and run on
Windows or Linux. As a result, the graphics libraries used for image processing make the program's execution
speed highly dependent on the hardware configuration, which raises the design cost. To meet current demand,
tracking systems must be as affordable as possible while still satisfying the requirements on processing speed,
good handling, accuracy, and response time. Because it offers short development time and low cost while still
meeting the requirements on response speed and precision, the FPGA is an ideal solution for developing
real-time tracking algorithms.

In our study, the system uses a median filter, red color separation, and a morphological noise-removal
filter to distinguish the target object from background objects. The robot motors are given control signals
derived from the object's location. To increase the system's speed, it is built on an FPGA using a
combination of hardware cores and an embedded MicroBlaze processor. In contrast to previous work,
the contributions of this work can be summarized as follows: i) an object tracking robot system based on FPGA
is constructed, ii) the system captures real-time images of objects from the OV7670 camera, iii) a mathematical
morphological method is used to remove noise around the object, and iv) the entire system runs in real
time on the Xilinx Spartan-6 FPGA kit.

The rest of this paper is organized as follows. Section 2 presents the proposed object tracking system,
including an overview of the system and a description of the algorithms it uses. Section 3 describes the
structure of the embedded system on the FPGA kit and details the IP cores in the system. Section 4 provides
the implementation results of the tracking system on the FPGA and the experimental evaluation. Finally,
conclusions and future work are addressed in section 5.


2. THE PROPOSED OBJECT TRACKING SYSTEM 
2.1. System overview 

First, the camera provides the system with one image frame. The image is then filtered to enhance its
quality, separated by red color, and cleaned of noise regions in order to isolate the object. After object
separation, the system determines the object's shape and its location relative to the image center, and then
generates a motor control signal that moves the robot in that direction. The processing flow of the
object tracking system is shown in Figure 1.


[Flow diagram: Take image from camera -> Filter and separate images -> Determine object coordinates -> Motor control]


Figure 1. The process of the object tracking system


2.2. Deploying algorithm steps 
2.2.1. Collecting object images 

The OV7670 camera continuously captures images of objects. The camera delivers its pixel data over an
8-bit parallel interface, and the system configures the camera through the standard serial camera control
bus (SCCB) [16]. The received image is a 320x240 pixel RGB565 color image [18].
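
To make the RGB565 format concrete, the following is a minimal sketch of how two bytes from the 8-bit bus could be joined into one pixel and expanded to 8 bits per channel. The function names are hypothetical, and the byte order (high byte first) is an assumption rather than something stated in the text.

```python
def assemble_pixel(first_byte, second_byte):
    """Join two bytes from the 8-bit bus into one 16-bit RGB565 pixel.

    Assumption: the high byte (red and upper green bits) arrives first.
    """
    return ((first_byte & 0xFF) << 8) | (second_byte & 0xFF)


def rgb565_to_rgb888(pixel):
    """Expand one 16-bit RGB565 pixel to an 8-bit-per-channel (R, G, B) tuple."""
    r5 = (pixel >> 11) & 0x1F  # 5 bits of red
    g6 = (pixel >> 5) & 0x3F   # 6 bits of green
    b5 = pixel & 0x1F          # 5 bits of blue
    # Replicate the high bits into the low bits so that full scale maps to 255.
    r8 = (r5 << 3) | (r5 >> 2)
    g8 = (g6 << 2) | (g6 >> 4)
    b8 = (b5 << 3) | (b5 >> 2)
    return r8, g8, b8
```

Under these assumptions, a saturated red pixel (0xF800) expands to (255, 0, 0), which is the nominal color of the target object.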


2.2.2. Apply filters to separate objects 

A 3x3 median filter is first applied to the red, green, and blue (RGB) color image collected from the
camera to reduce noise. The image is then color-separated into a binary image in which a black background
surrounds the white pixel blocks that represent the object to be processed. Finally, a morphological filter is
applied to the binary image to improve its quality and make it easier to find the contour around the object.
a. Median filter

Boyle's median filter is a popular choice for reducing mild impulse noise. Because impulse noise
typically appears as isolated pixels whose gray level differs significantly from that of their neighbors, the
median filter suppresses it by replacing a pixel's value with the median of the gray levels of the nearby
pixels. The core principle of the median filter is to slide a mask over every pixel of the input image; masks
of size 3x3, 5x5, or 7x7 are typical. A 3x3 median filter is shown in Figure 2. At each pixel location, the
values of the pixels covered by the mask are taken and arranged in ascending or descending order. The pixel
in the output image is then assigned the middle (median) value of the sorted sequence [19].
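
The sort-and-pick-the-middle procedure described above can be sketched in software as follows. This is an illustrative reference model, not the hardware implementation; copying border pixels unchanged is an assumption, since the text does not specify border handling.

```python
def median_filter_3x3(image):
    """Return a 3x3 median-filtered copy of a 2-D list of gray levels.

    Assumption: border pixels, which lack a full 3x3 neighborhood,
    are copied through unchanged.
    """
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Gather the nine values under the mask and sort them.
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1)
                      for dx in (-1, 0, 1)]
            window.sort()
            # The fifth of nine sorted values is the median.
            output[y][x] = window[4]
    return output
```

An isolated impulse (for example, a single 255 surrounded by values near 10) is replaced by a value from its neighborhood, while smooth regions are left essentially unchanged.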
b. Color separation of objects 

After the RGB image has been noise-filtered, it is passed to the color separation block, which
recognizes the object based on its color relative to the image background. To distinguish the color of the
object, it is necessary to examine the RGB values at several locations across the image. The red ball used in
this article has a nominal RGB value of (255, 0, 0); however, the image captured by the camera has a low
resolution and is affected by lighting. Lighting causes the R, G, and B values in the red region of the
captured image to vary around (255, 0, 0). The thresholds for the R, G, and B channels are therefore chosen
experimentally: pixels matching the object's color are assigned the value 1, and background pixels are
assigned the value 0. Figure 3 shows the resulting binary image.


Figure 2. An illustration of the median filter: the median value of the neighboring pixels (3x3) replaces
the current pixel value

Figure 3. Image after filtering and color separation
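
Per-channel thresholding of this kind can be sketched as below. The threshold values (`r_min`, `g_max`, `b_max`) are hypothetical placeholders; the paper states only that the thresholds were chosen experimentally and does not give them.

```python
def separate_red(pixels, r_min=150, g_max=90, b_max=90):
    """Binarize a 2-D list of (R, G, B) tuples: 1 = object, 0 = background.

    The default thresholds are illustrative assumptions, not the values
    used in the paper.
    """
    return [
        [1 if (r >= r_min and g <= g_max and b <= b_max) else 0
         for (r, g, b) in row]
        for row in pixels
    ]
```

A pixel is marked as object only if its red channel is high while green and blue are both low, which tolerates the lighting-induced drift around (255, 0, 0).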


c. Morphological filtering 

Because of the conditions described above, the object extracted from the background still contains
defects: noise regions and interior holes left empty where object pixels were confused with the background.
The object should therefore be refined by eliminating the noise regions that do not belong to it and filling
the holes inside it, so that the best possible information is delivered to the subsequent blocks of the
system. Mathematical morphology (MM) is one of the techniques used to filter objects after they have been
separated from the background [20], [21].

MM is a set-theory-based method for processing geometrical structures. Its basic operations, which
are based on structure and shape, simplify the image while preserving the key features of the original, as
seen in Figure 4. The fundamental idea of MM is to probe the image with a small basic block, called the
structuring element, and to assess whether that element fits or misses the shapes in the image. In
Figure 4(a), dilation connects the broken points of the image, which increases the detail of the image. In
Figure 4(b), erosion removes groups of pixels that are much smaller than the object, eliminating noisy areas
so that the object can be identified more accurately. There are four basic morphological operations [22]
(Figure 5): i) dilation, which expands or thickens the object in the frame, ii) erosion, which shrinks or thins
the object in the frame, iii) opening, which is an erosion followed by a dilation with the same structuring
element, and iv) closing, which is a dilation followed by an erosion with the same structuring element.
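
The four operations can be illustrated with a small software model on a binary mask. This is a sketch under stated assumptions: a square structuring element of odd side `size` (3 by default here for readability) and pixels outside the image treated as background.

```python
def dilate(mask, size=3):
    """Binary dilation with a size x size square structuring element."""
    half = size // 2
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Output is 1 if any pixel under the element is 1.
            out[y][x] = int(any(
                mask[yy][xx]
                for yy in range(max(0, y - half), min(h, y + half + 1))
                for xx in range(max(0, x - half), min(w, x + half + 1))))
    return out


def erode(mask, size=3):
    """Binary erosion; pixels outside the image count as background."""
    half = size // 2
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            # Output is 1 only if every pixel under the element is 1.
            out[y][x] = int(all(
                mask[yy][xx]
                for yy in range(y - half, y + half + 1)
                for xx in range(x - half, x + half + 1)))
    return out


def opening(mask, size=3):
    """Erosion followed by dilation: removes small noise specks."""
    return dilate(erode(mask, size), size)


def closing(mask, size=3):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(mask, size), size)
```

Opening removes isolated noise pixels smaller than the structuring element while restoring the object to roughly its original size; closing fills interior holes in the same way.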


Figure 4. An example of applying two MM operations on block A: (a) block A after dilation and
(b) block A after erosion

Figure 5. An example of an image after opening and closing


2.2.3. Determine object coordinates 

As shown in Figure 6, the first step is to find the contour around the object. The contour edges are
constructed based on the minimum distance from the object to the corresponding sides of the frame. The
coordinates of the object are determined from the center of the contour relative to the center of the image
frame. From these coordinates, the direction of movement for the robot is determined so that the object
returns to the center of the frame.


[Diagram labels: Top Edge, Bottom Edge]


Figure 6. Find the object contour
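
One plausible reading of this step is a bounding box around the white pixels of the binary image, followed by the offset of the box center from the frame center. The sketch below follows that reading; the function names and the sign convention (positive x to the right, positive y downward) are assumptions.

```python
def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, or None."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None  # no object pixels in this frame
    return min(xs), min(ys), max(xs), max(ys)


def offset_from_center(mask):
    """Offset (dx, dy) of the bounding-box center from the frame center."""
    box = bounding_box(mask)
    if box is None:
        return None
    x_min, y_min, x_max, y_max = box
    center_x = (x_min + x_max) / 2
    center_y = (y_min + y_max) / 2
    height, width = len(mask), len(mask[0])
    # Frame center in pixel-index coordinates.
    return center_x - (width - 1) / 2, center_y - (height - 1) / 2
```

A positive `dx` then means the object lies to the right of the frame center, so the robot would need to turn toward it to bring the object back to the center.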


3. FPGA EMBEDDED SYSTEM STRUCTURE 
3.1. Embedded system structure diagram 

Figure 7 shows the detailed block diagram of the embedded system on the Spartan-6 FPGA SP605
evaluation kit, which includes:

- One 32-bit MicroBlaze processor core [23] running at 100 MHz with 32 K of data and instruction memory,
connected to the high-speed computing cores and peripheral cores through the AXI interface.

- One UART controller with a baud rate of 128,000, which transfers the image to be processed from the
computer to the board and returns the processed image for display on the computer.

- External RAM with a maximum capacity of 128 MB, which stores the image to be filtered and is connected
through a single SDRAM controller core.

- An IP core for the median filter, comprising two FIFO memories and one median filter core.

- An IP core that performs dilation (morphological filtering), comprising two FIFO memories and one
dilation filter core, and an IP core that performs erosion (morphological filtering), comprising two FIFO
memories and one erosion filter core.

- A Xilinx DMA controller core that increases data processing performance when moving data
between the hardware IP cores and external RAM [24].

- A PWM core that generates the pulses that drive the robot's motors.

- Two clock sources on the Xilinx board: a 200 MHz differential oscillator and a 27 MHz single-ended
oscillator [25].

- The AXI interface, including AXI4, AXI4-Lite, and AXI4-Stream [26], which connects the cores to
MicroBlaze.

Figure 7. The embedded system structure diagram on the Spartan-6 FPGA SP605 evaluation kit from Xilinx


3.2. Structure of hardware cores

3.2.1. Median filter and color separation of objects

a. Median filter element 

Figure 8 shows the block diagram of the median filter and color separation block, which includes:

- Two FIFOs that synchronize data between two clock domains: the AXI stream domain, which runs at
100 MHz, and the internal clock domain of the median filter, which runs at 30 MHz.

- A pipelined median filter mechanism, implemented as a hard core.

- A control_bi block that synchronizes the writing of data from the median filter output to FIFO Out.

- A color separation block, which essentially binarizes each of the R, G, and B channels with the
required thresholds.

The median is computed as follows. Consider a 3x3 mask: the pixels are first sorted in ascending order
within each row, then in descending order within each column, and finally along the diagonal. The median
of the diagonal equals the median of the 3x3 mask [27]. Figure 9 shows the basic node hardware used in this
approach: it compares two 8-bit inputs A and B and outputs the larger number H and the smaller number L.
Figure 10 shows how the basic nodes are combined to sort a block according to the procedure above and
determine the median of a 3x3 mask.

Figure 8. Block diagram of median filter and color separation

Figure 9. Hardware of basic node

Figure 10. The block hardware calculates the 3x3 mask median from the basic node
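
The node-based median search can be modeled in software as a small sorting network. The sketch below assumes the reading in which rows are sorted in ascending order, columns in descending order, and the median is then taken from the main diagonal; the function names are hypothetical, and the arrangement of nodes is illustrative rather than a copy of Figure 10.

```python
def node(a, b):
    """Basic node: return (H, L), the larger and the smaller of two inputs."""
    return (a, b) if a >= b else (b, a)


def sort3_ascending(a, b, c):
    """Sort three values in ascending order using three basic nodes."""
    h, l = node(a, b)   # h = max(a, b), l = min(a, b)
    h, m = node(h, c)   # h is now the largest of all three
    m, l = node(l, m)   # order the remaining two values
    return l, m, h


def median_3x3(window):
    """Median of a 3x3 window given as three rows of three values."""
    # Step 1: sort every row in ascending order.
    rows = [sort3_ascending(*row) for row in window]
    # Step 2: sort every column in descending order.
    # cols[c][r] is the value at row r of column c after sorting.
    cols = [sort3_ascending(rows[0][c], rows[1][c], rows[2][c])[::-1]
            for c in range(3)]
    # Step 3: the main diagonal now holds the largest row minimum, the
    # middle row median, and the smallest row maximum; its median is
    # the median of all nine values.
    diagonal = (cols[0][0], cols[1][1], cols[2][2])
    return sort3_ascending(*diagonal)[1]
```

Because every step uses only the two-input compare-and-swap node, the same structure maps naturally onto a fixed pipeline of comparators, with no data-dependent control flow.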


b. Build data flow 

As shown in Figure 11, the median filter block is built as a pipeline. Non-overlapping 3x4 blocks of
the image (12 pixels each) are fed in consecutively from FIFO In, one per clock cycle; after 3 cycles the
first filtered data are available, and each filter cycle then produces 4 pixels. Figure 12 shows the pipeline
architecture of the morphological filtering IP.


[Diagram: consecutive image blocks Block_1, Block_2, each yielding pixels P1 to P4]


Figure 11. Median filtering IP pipeline architecture


[Diagram: pipeline stages operating on columns Col 1 to Col 4]


Figure 12. Morphological filtering IP pipeline architecture


3.2.2. Morphological filtering 

We construct two IP cores, dilation and erosion, based on morphological filtering theory.
The hardware architecture and I/O data flow of the two IP cores are identical. As shown in Figure 13, the
entire image is scanned with a 9x9 structuring element. The structuring element moves one pixel at a time
over the whole image, from left to right and top to bottom, and a new output pixel is generated for each 9x9
block of the image covered by the 9x9 structuring element. With an image size of 320x240 pixels, shifting
the structuring element pixel by pixel in this way takes a very long time. For this reason, the article uses a
pipelined computation on the input data stream of each 9x9 image block, shown in Figure 14, to shorten the
execution time. Nine cycles after the first data are pushed in, the first 9 bits of output data, Out1 to Out9,
are available, as shown in Figure 15. From then on, every cycle produces 9 bits of morphologically filtered
data. Thanks to the pipeline architecture, the computation speed of the system is therefore very high.
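
As a rough illustration of the pipeline's benefit, the cycle count for one 320x240 binary frame can be estimated with simple arithmetic. This is a back-of-the-envelope model only: it assumes a fill latency of 9 cycles and 9 output bits per cycle thereafter, and it ignores border handling, FIFO stalls, and memory transfer time, none of which are quantified in the text.

```python
import math


def pipelined_cycles(width=320, height=240, fill_latency=9, bits_per_cycle=9):
    """Estimate clock cycles to filter one binary frame in the pipeline.

    Assumptions: the first group of outputs appears after `fill_latency`
    cycles, and every later cycle yields `bits_per_cycle` output pixels.
    """
    total_pixels = width * height
    output_groups = math.ceil(total_pixels / bits_per_cycle)
    # The first group costs the fill latency; each further group costs one cycle.
    return fill_latency + (output_groups - 1)
```

Under these assumptions a full frame takes 8542 cycles, compared with 76,800 output pixels that a design producing one pixel per cycle would need at least as many cycles to emit.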


Figure 13. Morphological filter calculation


Figure 15. Object contour values


3.2.3. Engine control 

The robot follows the object based on the coordinates and contour of the object, as shown in
Figure 15. Forward and backward motion is controlled by the parameters $Y_{\min}$ and $Y_{\max}$:
i) $\beta < Y_{\max} - Y_{\min} < \alpha$: the robot is stationary, ii) $Y_{\max} - Y_{\min} \leq \beta$: the robot
moves backwards, and iii) $Y_{\max} - Y_{\min} \geq \alpha$: the robot moves forwards. Turning left or right is
controlled by the coordinates $(X_{o}, Y_{o})$ of the object's center of gravity: i) $(X_{o}, Y_{o}) \in \{B \cup C\}$:
the robot turns left, and ii) $(X_{o}, Y_{o}) \in \{A \cup D\}$: the robot turns right.

Figure 16 depicts the IP core for motor control, which consists of an FSM module and a PWM module.
The FSM module receives the signals (cometo, backward, right, left) from the DSP module and decides how to
drive the robot so that it follows the object. The output data of the FSM module (driver_1, driver_2) are
sent to the PWM module, which generates the pulses (signal_1, signal_2) that make the two robot motors
rotate at the right speed and in the right direction.


[Block diagram: MOTOR CONTROL IP CORE containing an FSM module and a PWM module]


Figure 16. Motor control IP-core
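
A behavioral sketch of how an FSM stage might map the four command signals to two motor drive values, and how a PWM stage might turn a drive value into a pulse width, is given below. Everything beyond the signal names is an assumption made for illustration: the duty-cycle table, the priority of turn commands over straight-line commands, and the choice of driver_1 as the left motor are hypothetical and are not taken from the paper.

```python
# Hypothetical drive table: command -> (driver_1, driver_2) in signed percent
# of full speed. Assumption: driver_1 is the left motor, driver_2 the right.
DRIVE_TABLE = {
    "stop": (0, 0),
    "cometo": (60, 60),
    "backward": (-60, -60),
    "left": (20, 60),    # slow the left motor to turn left
    "right": (60, 20),   # slow the right motor to turn right
}


def fsm_step(cometo=False, backward=False, left=False, right=False):
    """Choose (driver_1, driver_2) from the command signals.

    Assumed priority: turning first, then forward or backward, else stop.
    Conflicting commands (left and right, or cometo and backward) stop the robot.
    """
    if left != right:
        return DRIVE_TABLE["left" if left else "right"]
    if cometo != backward:
        return DRIVE_TABLE["cometo" if cometo else "backward"]
    return DRIVE_TABLE["stop"]


def pwm_signal(duty_percent, period_ticks=100):
    """Return (direction, high_ticks) for one PWM period.

    direction is +1 or -1; high_ticks is how many counter ticks of the
    period the output stays high.
    """
    direction = 1 if duty_percent >= 0 else -1
    high_ticks = abs(duty_percent) * period_ticks // 100
    return direction, high_ticks
```

For example, `fsm_step(left=True)` yields `(20, 60)` under this table, and `pwm_signal(20)` then keeps the left motor's output high for 20 of every 100 counter ticks.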


4. RESULTS OF IMPLEMENTATION AND ASSESSMENT 
4.1. Hardware synthesis results 

The embedded system was implemented on the Spartan-6 kit. It uses about 80% of the memory
elements, including RAM blocks and LUTs, and about 20% of the other logic of the Spartan-6 kit. This result
shows that the design is suitable for resource-constrained systems.


4.2. Execution time results 


Table 1 shows the execution time of each block, listed in the order of the algorithm steps.


Table 1. Execution time results 


No.   Block name             Execution time (s)
1     Collect photo frames   1.0
2     Median filter          1.5
3     Color separation       0.5
4     Morphological filter   2.0


4.3. Performance evaluation results 

To evaluate the system's tracking performance, the team tested the robot's ability to follow the
object in the correct direction under good lighting conditions. Table 2 shows the performance evaluation
for each movement direction.


Table 2. Performance evaluation 


Light conditions   Movement direction   Result   Efficiency
Good               Forward (25 times)   23/25    88%
                   Back (25 times)      22/25
                   Left (25 times)      22/25
                   Right (25 times)     21/25
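
The per-direction and overall success rates implied by the trial counts in Table 2 can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the table; the dictionary and function names are illustrative.

```python
# Trial counts from Table 2: direction -> (successful runs, total runs).
RESULTS = {
    "Forward": (23, 25),
    "Back": (22, 25),
    "Left": (22, 25),
    "Right": (21, 25),
}


def success_rates(results):
    """Return (per-direction success rates, overall success rate)."""
    per_direction = {name: ok / total for name, (ok, total) in results.items()}
    total_ok = sum(ok for ok, _ in results.values())
    total_runs = sum(total for _, total in results.values())
    return per_direction, total_ok / total_runs
```

With these counts, the robot followed correctly in 88 of 100 trials overall, with the forward direction the most reliable and the right turn the least.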


5. CONCLUSION AND FUTURE WORK

This study proposes a low-cost, low-power real-time object tracking robot control system on the
Spartan®-6 FPGA SP605 evaluation kit. The system has been tested on the kit and is capable of accurately
directing the robot to pursue red objects under various lighting conditions. The system is built from
high-speed pipelined hardware cores. To enable high-speed operation, DMA is also used to transfer data in
bursts between the external DDR3 memory and the hardware IP cores. The system's response time to the
movement of the object is adequate. Our future work will focus on implementing robot collaboration using
FPGAs. Fast real-time embedded systems for robots will be crucial in ensuring effective collaborative work
as the adoption of robots working alongside humans in manufacturing rises.